Hierarchical Reinforcement Learning in Continuous State Spaces Dissertation

Author

  • Hans Vollbrecht
Abstract

Reinforcement learning (RL) has been studied intensively for almost two decades. It has been attractive both for theoretical investigation, because of its sound mathematical foundation, and for practical applications, because of the possibility of purely empirical learning. In the second half of the nineties, the theory was given a mature, unifying formulation in the work of R. Sutton, A. Barto, D. Bertsekas and J. Tsitsiklis [Sutton & Barto, 1998; Bertsekas & Tsitsiklis, 1996]. On the application side, however, the major obstacle to broad acceptance of RL has been long learning times, and much research has attempted to overcome this problem. On the other hand, independently of RL research, behavioral architectures have been developed and applied to real-world applications (mostly in robotics) since the second half of the eighties, in an atmosphere of great departure and creativity that overcame some dogmas of the classical AI approaches. These approaches had notable success in practical applications but lacked a theoretical foundation: building a successful subsumption architecture [Brooks, 1986] for a given control problem, for example, is something of an art. The work of this thesis started from the idea that a behavioral architecture can be defined with sound principles of behavior composition when it is viewed first as a learning problem, and not exclusively as an execution problem, as has mostly been done in the past. Since RL abstracts basic concepts such as state, action, and goal-directedness within a homogeneous, simple theoretical framework, it seemed promising to take RL as the unique framework for learning in a behavioral architecture. This choice puts the evaluation of architectures on a solid and transparent basis: optimality in the sense of the optimality of an RL-learned solution to a control problem, i.e. maximization of the expected future reward.
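The optimality criterion invoked above — maximization of the expected future reward — is standardly formalized via the value function of a policy π (textbook notation, not quoted from the thesis):

```latex
V^{\pi}(s) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \,\middle|\, s_0 = s\right],
\qquad
\pi^{*} \;=\; \arg\max_{\pi} V^{\pi}(s) \quad \text{for all } s,
```

with discount factor 0 ≤ γ < 1. An RL-learned behavior is then "optimal" exactly when its policy attains the maximal expected discounted return from every state.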
This definition of optimality finds a coherent extension applicable to hierarchical behavior architectures. The thesis presents basic composition principles of learned behaviors in a...


Similar resources

K-Means Clustering based Reinforcement Learning Algorithm for Automatic Control in Robots

Reinforcement learning is a key research topic in automatic control, and hierarchical reinforcement learning is a good solution to the curse of dimensionality. However, hierarchical reinforcement learning can only deal with discrete spaces, while the state and action spaces in robotic automatic control are continuous. In order to deal with continuous spaces in hierarchical reinforcement learning, we...
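The discretization idea in this teaser can be sketched roughly as follows (an illustrative reconstruction, not the paper's algorithm — the function names and the deterministic farthest-point initialization are my own choices): continuous states are clustered with k-means, and each state is then replaced by its cluster index so that a tabular (hierarchical) RL method applies on top.

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Plain k-means over continuous states, with deterministic
    farthest-point initialization instead of random seeding."""
    centers = [points[0]]
    for _ in range(1, k):
        # Next center: the point farthest from all chosen centers.
        dists = np.min([np.linalg.norm(points - c, axis=1) for c in centers],
                       axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = np.argmin(
            np.linalg.norm(points[:, None] - centers[None], axis=2), axis=1)
        # Update step: move each center to the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers

def discretize(state, centers):
    """Map a continuous state to the discrete index of its nearest
    cluster center, yielding a finite state space for tabular RL."""
    return int(np.argmin(np.linalg.norm(centers - state, axis=1)))
```

A tabular learner would then index its Q-table with `discretize(s, centers)` in place of the raw continuous state `s`.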


Hierarchical Policy Gradient Algorithms

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning (PGRL) methods have received recent attention as a means to solve problems with continuous state spaces. However, they suffer from slow convergence. In this paper, we combine these two approaches and propose a family ...
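The policy-gradient ingredient mentioned here can be illustrated with a minimal one-step REINFORCE sketch (my own toy example — a softmax bandit, not the hierarchical family of algorithms the paper proposes): action preferences are adjusted along the score-function gradient, weighted by the observed return.

```python
import numpy as np

def reinforce_bandit(rewards=(0.0, 1.0), episodes=2000, lr=0.1, seed=0):
    """One-step REINFORCE: sample an action from a softmax policy,
    then update preferences theta by  G * d/dtheta log pi(a | theta)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(len(rewards))           # action preferences
    for _ in range(episodes):
        probs = np.exp(theta - theta.max())
        probs /= probs.sum()                 # softmax policy
        a = rng.choice(len(rewards), p=probs)
        g = rewards[a]                       # return of this one-step episode
        grad = -probs
        grad[a] += 1.0                       # gradient of log softmax at a
        theta += lr * g * grad
    return theta
```

The same score-function update generalizes to multi-step episodes and parameterized policies over continuous states, which is where the slow convergence the teaser mentions becomes the practical issue.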


Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool for knowledge transfer among agents. The agents are assumed to be heterogeneous: they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...


Multi-resolution Exploration in Continuous Spaces

The essence of exploration is acting to try to decrease uncertainty. We propose a new methodology for representing uncertainty in continuous-state control problems. Our approach, multi-resolution exploration (MRE), uses a hierarchical mapping to identify regions of the state space that would benefit from additional samples. We demonstrate MRE’s broad utility by using it to speed up learning in ...
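A toy 1-D sketch of such a hierarchical mapping (the `Node` class, the `cap` split threshold, and the count-based uncertainty proxy are all assumptions of mine, not details from the paper): a region of the state space splits in two once it has collected enough samples, so resolution grows exactly where the agent has visited, and sparsely sampled leaves read as high-uncertainty exploration targets.

```python
class Node:
    """One region [lo, hi) of a 1-D state space.  A region splits in two
    once it holds `cap` samples; `min_width` stops infinite refinement."""

    def __init__(self, lo, hi, cap=8, min_width=1e-3):
        self.lo, self.hi = lo, hi
        self.cap, self.min_width = cap, min_width
        self.samples = []
        self.children = None

    def _child(self, x):
        mid = (self.lo + self.hi) / 2.0
        return self.children[0] if x < mid else self.children[1]

    def insert(self, x):
        if self.children is not None:
            self._child(x).insert(x)
            return
        self.samples.append(x)
        if len(self.samples) >= self.cap and (self.hi - self.lo) > self.min_width:
            mid = (self.lo + self.hi) / 2.0
            self.children = [Node(self.lo, mid, self.cap, self.min_width),
                             Node(mid, self.hi, self.cap, self.min_width)]
            for s in self.samples:           # push samples down one level
                self._child(s).insert(s)
            self.samples = []

    def uncertainty(self, x):
        """Crude proxy: leaves with few samples are 'uncertain', hence
        attractive targets for exploration."""
        if self.children is not None:
            return self._child(x).uncertainty(x)
        return 1.0 / (1.0 + len(self.samples))
```

An exploration bonus proportional to `uncertainty(s)` would then steer the agent toward coarsely sampled regions.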


Closed-Loop Learning of Visual Control Policies

In this dissertation, I introduce a general, flexible framework for learning direct mappings from images to actions in an agent that interacts with its surrounding environment. This work is motivated by the paradigm of purposive vision. The original contributions consist in the design of reinforcement learning algorithms that are applicable to visual spaces. Inspired by the paradigm of local-ap...


Efficient Model-based Exploration in Continuous State-space Environments

Abstract of the dissertation "Efficient Model-based Exploration in Continuous State-space Environments" by Ali Nouri; Dissertation Director: Michael L. Littman. The impetus for exploration in reinforcement learning (RL) is decreasing uncertainty about the environment for the purpose of better decision making. As such, exploration plays a crucial role in the efficiency of RL algorithms. In this dissertation,...



Publication date: 2004